Analysing the Integration of Semantic Web Features for Document Planning across Genres

Authors

  • Marta Vicente
  • Elena Lloret
Abstract

Language is studied and analysed in different disciplines, generally on the premise that it constitutes a form of communication that pursues a specific objective. A discourse, in that sense, can be understood as a text constructed to express that objective. When a discourse is created, its production is tied to some textual genre, usually connected with pragmatic features, such as the intention of the writer or the audience to whom it is addressed, both of which condition the use of language. But genres can also be seen as compounds of different pieces of text with a certain degree of order, each pursuing a more concrete objective. This paper presents a proposal to learn such features as a way to generate richer document plans, applying clustering techniques over annotated documents.

1 Motivation and Research Context

The current research is carried out from a conception of Natural Language Generation (NLG) in which the creation of a text requires an intermediate output called a document plan. It is in the macroplanning stage that the system produces this plan of selected and ordered content. At present, our work focuses on how to elaborate that plan so as to meet certain requirements regarding the flexibility of the system: it should be able to produce different outcomes conditioned by the communicative goal, the audience and, on the whole, the context. Hence, the main aim of our current research is to enrich the pragmatic facet of the NLG process. The expected outcome is a scheme or ordering of the ideas that should be realised in a set of cohesive and coherent sentences and paragraphs. According to some theories of discourse (Bakhtin, 2010; Halliday et al., 2014), genres can be understood as social constructions that establish a connection between the discourse and the situation in which it is produced, reflected both in its structure and its content.
According to Swales (1990): “A genre comprises a class of communicative events, the members of which share some set of communicative purposes. These purposes are recognised by the expert members of the parent discourse community, and thereby constitute the rationale for the genre. This rationale shapes the schematic structure of the discourse and influences and constrains choice of content and style.”

Besides, genres are interesting because they relate to communicative purposes in different ways, from a global viewpoint down to fine-grained levels. As an example, consider a person looking for a recommendation on review pages. Recommending would have been the main, global purpose of the text he consults when it was created. But the writer may also have wanted to explain the motivation for the journey, to narrate a personal experience or to describe the facilities in order to complete his review. Narration, description, recommendation and so on represent low-level functions of the text related to the intention of the writer and, in some cases, they can be identified as different sets of sentences. This leads us to the possibility of learning the structure of the text and its features, which differ from one genre to another.

In reviews, the presence and order of the parts are not strict. One traveller may not share his personal story but describe the room and recommend the brand, while another first evaluates and then describes. An example illustrating this can be found in Table 1. Conversely, it would make no sense to write a scientific article that reports the results before explaining the methodology, or that does not explain it at all.





Publication date: 2016